{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## 神经网络\n",
    "\n",
    "PyTorch中构建神经网络是通过`torch.nn`包。`nn`是建立在`autograd`的基础上，一个`nn.Module`包括了层(layers)和`forward(input)`方法。\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "一个典型的神经网络训练过程可分为以下步骤：\n",
    "\n",
    "- 定义神经网络和对应的可学习参数\n",
    "- 迭代输入数据\n",
    "- 将输入传入网络中\n",
    "- 计算loss\n",
    "- 传播梯度到网络的参数\n",
    "- 更新网络参数，通常地使用一个简单的规则：$$weight=weight-learn\\_rate*gradient$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 定义网络"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Net(\n",
      "  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))\n",
      "  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))\n",
      "  (fc1): Linear(in_features=400, out_features=120, bias=True)\n",
      "  (fc2): Linear(in_features=120, out_features=84, bias=True)\n",
      "  (fc3): Linear(in_features=84, out_features=10, bias=True)\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from torch.autograd import Variable\n",
    "import torch.nn as nn\n",
    "import torch.nn.functional as F\n",
    "\n",
    "class Net(nn.Module):\n",
    "\n",
    "    def __init__(self):\n",
    "        super(Net, self).__init__()\n",
    "        # 1 input image channel, 6 output channels, 5x5 square convolution\n",
    "        # kernel\n",
    "        self.conv1 = nn.Conv2d(1, 6, 5)\n",
    "        self.conv2 = nn.Conv2d(6, 16, 5)\n",
    "        # an affine operation: y = Wx + b\n",
    "        self.fc1 = nn.Linear(16 * 5 * 5, 120)\n",
    "        self.fc2 = nn.Linear(120, 84)\n",
    "        self.fc3 = nn.Linear(84, 10)\n",
    "\n",
    "    def forward(self, x):\n",
    "        # Conv-->ReLU-->maxpool\n",
    "        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2)) #Max pooling stride=(2, 2)\n",
    "        # 如果卷积核是正方形，可以指定一个值\n",
    "        x = F.max_pool2d(F.relu(self.conv2(x)), 2)\n",
    "        x = x.view(-1, self.num_flat_features(x)) # view就是reshape操作，相当于flatten\n",
    "        x = F.relu(self.fc1(x))\n",
    "        x = F.relu(self.fc2(x))\n",
    "        x = self.fc3(x)\n",
    "        return x\n",
    "\n",
    "    def num_flat_features(self, x):\n",
    "        '''计算张量中除了第一个维度，其他维度所有元素值'''\n",
    "        size = x.size()[1:]  # 取出除了batch的其他维度\n",
    "        num_features = 1\n",
    "        for s in size:\n",
    "            num_features *= s\n",
    "        return num_features\n",
    "\n",
    "net = Net()\n",
    "print(net)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "模型的可学习参数可通过`net.parameters()`获取:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10\n",
      "torch.Size([6, 1, 5, 5])\n"
     ]
    }
   ],
   "source": [
    "params = list(net.parameters())\n",
    "print(len(params))\n",
    "print(params[0].size())  # conv1's .weight"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "前向传播的输入和输出都是`autograd.Variable`。设置的输入为32x32，使用MNIST训练时将输入放缩到32x32.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Variable containing:\n",
      "1.00000e-02 *\n",
      " -2.5047  0.7743  0.4515  3.1252 -9.7592  1.3915  4.4318 -7.4003 -9.2446  2.8687\n",
      "[torch.FloatTensor of size 1x10]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "input = Variable(torch.randn(1, 1, 32, 32))\n",
    "out = net(input)\n",
    "print(out)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "注意`torch.nn`只支持Mini-batch，不支持单个样本，对于`nn.Conv2d`接收一个4D的张量`(nSample,nChannel,Height,Width)`。如果只有一个样本，使用`input.unsqueeze(0)`添加一个假的batch张量。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([3, 32, 32])\n",
      "torch.Size([1, 3, 32, 32])\n"
     ]
    }
   ],
   "source": [
    "inputz = Variable(torch.randn(3, 32, 32))\n",
    "print(inputz.size())\n",
    "inputz.unsqueeze_(0) # 添加一个假的维度\n",
    "print(inputz.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Loss Function\n",
    "loss函数以(output,target)作为输入，计算输出和目标值之间的差距。在`nn`包下有几种不同的loss函数，例如`nn.MSELoss`就是平均方误差："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Variable containing:\n",
      " 38.7233\n",
      "[torch.FloatTensor of size 1]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "output = net(input)\n",
    "target = Variable(torch.arange(1, 11))  # a dummy target, for example\n",
    "criterion = nn.MSELoss()\n",
    "\n",
    "loss = criterion(output, target)\n",
    "print(loss)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果调用`loss.backward()`，整个图都在做微分，graph中所有的变量的`.grad`计算梯度。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<MseLossBackward object at 0x000001F1D48FFC88>\n",
      "<AddmmBackward object at 0x000001F1D48FFC88>\n",
      "<ExpandBackward object at 0x000001F1D48C9DD8>\n"
     ]
    }
   ],
   "source": [
    "print(loss.grad_fn)  # MSELoss\n",
    "print(loss.grad_fn.next_functions[0][0])  # Linear\n",
    "print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### BackProp\n",
    "为了误差反向传播，我们要使用`loss.backward()`，还要先清除已存在的梯度。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "conv1.bias.grad before backward\n",
      "None\n",
      "conv1.bias.grad after backward\n",
      "Variable containing:\n",
      " 0.0744\n",
      " 0.0178\n",
      "-0.0417\n",
      "-0.1284\n",
      " 0.0986\n",
      "-0.0333\n",
      "[torch.FloatTensor of size 6]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "net.zero_grad()     # 清除所有的梯度\n",
    "\n",
    "print('conv1.bias.grad before backward')\n",
    "print(net.conv1.bias.grad)\n",
    "\n",
    "loss.backward() \n",
    "\n",
    "print('conv1.bias.grad after backward')\n",
    "print(net.conv1.bias.grad)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Update the weights\n",
    "\n",
    "一个最简单的更新规则是 Stochastic Gradient Descent (SGD):$$weight = weight - learning\\_rate * gradient$$\n",
    "\n",
    "在使用神经网络时，我们希望使用不同的优化器，例如SGD, Nesterov-SGD, Adam, RMSProp, etc. PyTorch提供了`torch.optim`用于实现这些方法："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import torch.optim as optim\n",
    "\n",
    "# create your optimizer\n",
    "optimizer = optim.SGD(net.parameters(), lr=0.01)\n",
    "\n",
    "# in your training loop:\n",
    "optimizer.zero_grad()   # zero the gradient buffers\n",
    "output = net(input)\n",
    "loss = criterion(output, target)\n",
    "loss.backward()\n",
    "optimizer.step()    # Does the update"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
